AMENDMENT TO THE CLAIMS 



1. (Currently amended) A computer readable storage media
storing instructions readable by a computer which, when
implemented, cause the computer to perform a method comprising:

segmenting a sentence of Chinese characters into
constituent Chinese words having one or more Chinese
characters;

recognizing an overlapping ambiguity string in the
segmented sentence, wherein the overlapping
ambiguity string comprises at least three Chinese
characters having at least two possible
segmentations;

obtaining probability information based on at least one 
context feature adjacent the overlapping ambiguity 
string; and 

outputting an indication for selecting one of the at 
least two possible segmentations as a function of
the obtained probability information. 

2. (Previously presented) The computer readable storage media of
claim 1, wherein obtaining the probability information 
comprises obtaining probability information from a language 
model based on the at least one context feature and a left or 
right portion of the overlapping ambiguity string. 

3. (Previously presented) The computer readable storage media of
claim 2 wherein the language model comprises a trigram model.
4. (Currently amended) The computer readable storage media of
claim 2 wherein outputting an indication for selecting one of
the at least two possible segmentations comprises classifying
the probability information.

5. (Previously presented) The computer readable storage media of 
claim 4 wherein classifying comprises classifying using Naive 
Bayesian Classification. 

6. (Previously presented) The computer readable storage media of 
claim 1 wherein segmenting the sentence comprises performing a 
Forward Maximum Matching (FMM) segmentation of the input 
sentence and a Backward Maximum Matching (BMM) segmentation of
the input sentence.

7. (Previously presented) The computer readable storage media of
claim 6 wherein recognizing the overlapping ambiguity string
comprises recognizing a segmentation O_f of the overlapping
ambiguity string from the FMM segmentation and a segmentation
O_b of the overlapping ambiguity string from the BMM
segmentation.

8. (Currently amended) The computer readable storage media of
claim 7, wherein outputting the indication comprises selecting
one of the at least two segmentations as a function of a set of
context features surrounding the overlapping ambiguity string.

9. (Currently amended) The computer readable storage media of
claim 8 wherein the set of context features comprises words or
grammatical features surrounding the overlapping ambiguity
string.



10. (Currently amended) The computer readable storage media of
claim 8, wherein outputting the indication comprises
classifying the probability information of the set of context
features and O_f.

11. (Currently amended) The computer readable storage media of
claim 8, wherein outputting the indication comprises
classifying the probability information of the set of context
features and O_b.

12. (Currently amended) The computer readable storage media of
claim 8, wherein outputting the indication comprises
determining which of O_f or O_b has a higher probability as a
function of the set of context features.

13. (cancelled) 

14. (Currently amended) A method of segmentation of a sentence
of Chinese text, the sentence having an overlapping ambiguity
string, the method comprising:

generating a Forward Maximum Matching (FMM)
segmentation of the sentence;

generating a Backward Maximum Matching (BMM)
segmentation of the sentence;

recognizing the overlapping ambiguity string based on a
difference between the FMM segmentation and the BMM
segmentation;

obtaining probability information based on at least one
context feature surrounding the overlapping
ambiguity string and at least part of the
overlapping ambiguity string; and



outputting an indication for selecting one of the FMM 
segmentation and the BMM segmentation as a function 
of obtained probability information. 
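
By way of illustration only, and not as part of the claim language, the FMM and BMM steps of claim 14 and the recognition of the overlapping ambiguity string from their difference can be sketched as follows. The toy lexicon, the example sentence, the `max_len` parameter, and the function names `fmm`, `bmm`, `boundaries`, and `find_oas` are all assumptions introduced for this sketch; the source does not specify an implementation.

```python
# Hypothetical toy lexicon; a real system would use a full dictionary.
LEXICON = {"各国", "国有", "企业"}


def fmm(sentence, lexicon, max_len=4):
    """Forward Maximum Matching: scan left to right, taking the longest match."""
    words, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            # A single character is always accepted as a fallback word.
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


def bmm(sentence, lexicon, max_len=4):
    """Backward Maximum Matching: scan right to left, taking the longest match."""
    words, j = [], len(sentence)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = sentence[j - size:j]
            if size == 1 or piece in lexicon:
                words.insert(0, piece)
                j -= size
                break
    return words


def boundaries(words):
    """Return the set of character offsets at which word boundaries fall."""
    cuts, pos = {0}, 0
    for w in words:
        pos += len(w)
        cuts.add(pos)
    return cuts


def find_oas(sentence, lexicon):
    """Return spans where the FMM and BMM segmentations disagree."""
    fb = boundaries(fmm(sentence, lexicon))
    bb = boundaries(bmm(sentence, lexicon))
    common = sorted(fb & bb)
    spans = []
    for start, end in zip(common, common[1:]):
        inner_f = {c for c in fb if start < c < end}
        inner_b = {c for c in bb if start < c < end}
        if inner_f != inner_b:  # the two segmentations cut this span differently
            spans.append(sentence[start:end])
    return spans


if __name__ == "__main__":
    s = "各国有企业"
    print(fmm(s, LEXICON), bmm(s, LEXICON), find_oas(s, LEXICON))
```

In this sketch the two segmentations agree on the final word and differ only on the first three characters, so that three-character span is reported as the candidate overlapping ambiguity string.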

15. (Previously presented) The method of claim 14 wherein 
outputting includes selecting one of the FMM segmentation of the 
overlapping ambiguity string and the BMM segmentation of the 
overlapping ambiguity string based on higher probability. 

16. (Previously presented) The method of claim 15 wherein 
obtaining probability information comprises using an N-gram
model.

17. (Previously presented) The method of claim 16 wherein 
obtaining probability information comprises obtaining probability 
information about a first word of the overlapping ambiguity 
string. 

18. (Currently amended) The method of claim 16, wherein
obtaining probability information comprises using probability
information about a last word of the overlapping ambiguity
string.

19. (Currently amended) The method of claim 16, wherein
obtaining probability information comprises using the N-gram
model that includes probability information about context words
surrounding the overlapping ambiguity string.

20. (Currently amended) The method of claim 16, wherein using
the N-gram model comprises using trigram probability information
about a string of words comprising a first word of the
overlapping ambiguity string and two context words to the left of
the first word.

21. (Currently amended) The method of claim 16, wherein using
the N-gram model comprises using trigram probability information
about a string of words comprising a last word of the overlapping
ambiguity string and two context words to the right of the last
word.

22. (Currently amended) The method of claim 14, wherein
outputting includes using Naive Bayesian Classifiers.

23. (Currently amended) The method of claim 14, wherein
obtaining probability information comprises obtaining trigram
probability information and constructing an ensemble of Naive
Bayesian Classifiers from the trigram probability information.

24. (Currently amended) The method of claim 23, wherein
outputting an indication comprises identifying one of the FMM
segmentation and the BMM segmentation based on probability
calculated from the ensemble of Naive Bayesian Classifiers.

25. (Currently amended) A method of segmenting a sentence of
Chinese text comprising:

recognizing an overlapping ambiguity string in the
sentence;

replacing the overlapping ambiguity strings with
tokens;

receiving probability information from an N-gram
language model comprising probability
information for constituent words of the
overlapping ambiguity string and context
features surrounding the overlapping ambiguity
string;

resolving the overlapping ambiguity string based on the
received probability information.

26. (Currently amended) The method of claim 25, wherein
receiving probability information comprises receiving probability
information from a trigram language model.

27. (Currently amended) The method of claim 25, and further
comprising generating an ensemble of classifiers with the
received probability information.

28. (Currently amended) The method of claim 25, wherein
recognizing the overlapping ambiguity string comprises:

generating a Forward Maximum Matching (FMM)
segmentation for the sentence;

generating a Backward Maximum Matching (BMM)
segmentation for the sentence;

recognizing the overlapping ambiguity string based
on the FMM segmentation and the BMM segmentation
of the sentence.

29. (Original) The method of claim 28 and further comprising
generating an ensemble of classifiers as a function of the N-gram
model.

30. (Currently amended) The method of claim 29 wherein
generating the ensemble of classifiers includes approximating
probabilities of the FMM and BMM segmentations of the
overlapping ambiguity string as being equal to the product of
individual unigram probabilities of individual words in the FMM
and BMM segmentations, respectively, of the overlapping ambiguity
string.
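
By way of illustration only, and not as part of the claim language, the approximation recited in claim 30 treats the probability of each candidate segmentation as the product of the unigram probabilities of its words. A minimal sketch follows, working in log space to avoid underflow. The unigram values, the example words, and the names `segmentation_log_prob` and `pick_segmentation` are hypothetical and chosen only for this sketch; a full classifier per claims 29 and 31 would also weigh the surrounding context features, which this sketch omits.

```python
import math

# Hypothetical unigram probabilities, invented for illustration only.
UNIGRAM = {"各国": 0.002, "有": 0.01, "各": 0.005, "国有": 0.001}


def segmentation_log_prob(words, unigram_prob):
    """Log of the product of the unigram probabilities of the words."""
    return sum(math.log(unigram_prob[w]) for w in words)


def pick_segmentation(fmm_words, bmm_words, unigram_prob):
    """Return the label of the candidate with the higher approximate probability."""
    lp_f = segmentation_log_prob(fmm_words, unigram_prob)
    lp_b = segmentation_log_prob(bmm_words, unigram_prob)
    return "FMM" if lp_f >= lp_b else "BMM"


if __name__ == "__main__":
    o_f = ["各国", "有"]   # FMM segmentation of the ambiguity string
    o_b = ["各", "国有"]   # BMM segmentation of the ambiguity string
    print(pick_segmentation(o_f, o_b, UNIGRAM))
```

With these invented values the FMM candidate has the larger product, so it is selected; different unigram estimates could reverse the choice.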

31. (Currently amended) The method of claim 29, wherein
generating the ensemble of classifiers includes approximating a
joint probability of a set of context features conditioned on an
existence of one of the segmentations of the overlapping
ambiguity string based on a corresponding
probability of a leftmost and a rightmost word of the
corresponding overlapping ambiguity string.



